1.  Introduction


Tactical radios are held to more stringent requirements than commercial radios so
that they can operate reliably in hostile environments. The Brigade Modernization
Command, in conjunction with the Army Test and Evaluation Command’s
Operational Test Command, conducted large-scale test events such as the Network
Integration Evaluation to test radios in relevant tactical environments. Radio
networks under test are instrumented to record traffic transmitted between network
nodes. These data are processed and analyzed to determine how well a single radio
or the whole network performed in the test.

The collaboration between the Aberdeen Test Center and the US Army Research
Laboratory’s Computational and Information Sciences Directorate resulted in a
data processing system that reduces the collected traffic into manageable data
products. Analysts extract relevant metrics from these data products to support the
evaluation of the system-under-test performance.

One of these data products, Commsip, is related to packet-level analysis and is
critical to network evaluation. This data product includes statistics such as the
latency and completion rates between network nodes; both are derived by
correlating the data (packets) observed at each node during the test. The need to
correlate over 1 billion packets recorded during each day of testing was the driving
factor in employing high-performance-computing (HPC) assets to process the
massive volumes of data into a usable data model. This report explains the need for
this data model as well as the cut module within the HPC framework that creates it.

2.  Motivation  and  Desired  Outputs 

Most systems send data across a network encapsulated in an IP packet. The IP layer
data may arrive successfully, arrive out of order, or be dropped during transit. The
results of these 3 cases are needed to determine several aspects of network
performance.

IP  is  known  as  a  “best  effort”  delivery  protocol,  and  knowing  how  well  the  network 
delivers  IP  layer  data  provides  insights  into  how  well  the  network  performs  from 
the end user perspective. By performing IP layer packet analysis, one can determine
key network performance metrics, such as data delivery latency, load handling, and
overall delivery completion rates. This analysis becomes even more important
when the network is in the tactical domain because of the need for reliable and
secure networking.

To evaluate network performance, the Commsip data model is generated from raw
collected data. The model captures the history of each packet and includes details
from each endpoint at which it was observed.

While creating the Commsip data model, the calculations that analysts commonly
want can be performed in a distributed, parallel fashion, which significantly reduces
the time needed compared with a serially processed, database-driven analysis.
These calculations include packet matching, latency calculation, packet endpoint
determination, and the filtering out of local network traffic, which is of no interest
from an analysis perspective.

The Test and Evaluation community has determined most of the definitions and
layout of the data product, which can be seen in Table A-1 (see the Appendix).
Extra columns (refer to Table A-2) have been added to avoid redundant data
reduction in other cut modules, such as Transport, which draws Transmission
Control Protocol (TCP)-based statistics. Each row in the Commsip table typically
represents the combination of 2 observations of a packet: the sending side and the
receiving side. In some cases only one side of the transmission will be observed,
and the row will reflect that by leaving empty the fields that cannot be calculated
without both observations. Latency, for example, can be calculated only when both
sides are observed.

3.  Data  Organization 

This  section  describes  in  detail  what  reductions  and  manipulations  occur  within  the 
Commsip  cut  module.  The  module  takes  2  different  types  of  input  data  cuts.  The 
cuts  come  from  Binary  Large  Object  (BLOb)  files  and/or  Packet  Capture  (PCAP) 
files.  During  the  module’s  Process  stage,  important  information  is  pulled  from 
packets  and  saved  in  a  temporary  data  store.  This  information  is  then  read  in  during 
the  module’s  Crunch  stage,  where  packet  matching  calculations  on  the  data  occur. 
These simplified data are then turned into the Commsip Data Product.

4.  Prepare  File 

File metadata must be collected to properly organize the information found in data
cuts.  The  device  ID,  file  ID,  and,  in  the  case  of  a  PCAP  file,  the  recording  source 
are  all  required  for  correlating  packet  data  across  nodes  in  a  network. 

The  device  ID  is  an  identifier  used  in  mapping  an  Advanced  Distributed  Modular 
Acquisition  System  (ADMAS)  to  its  recorded  data.  The  file  metadata  provides  a 
serial  number  that  is  used  to  look  up  the  device  ID  in  a  predefined  reference. 

The file ID is a global identifier that represents the file. Each HPC processing core
is able to perform a lookup on a file ID and uniquely access the same file.

5.  Commsip  Process 

As the file parser iterates over cuts, they are passed into Commsip’s Process
method. Though BLOb Nettap cuts and PCAP cuts are roughly the same, BLOb
cuts contain more cut metadata. The metadata for each Nettap cut identifies the
network source data stream it was recorded from, whereas the data in a PCAP file
come from only one network source. BLOb Nettap cut metadata also indicate
whether the data collection device experienced an overflow error (resulting in
unrecorded data), what type of overflow error it was, and which interface the cut
was recorded on. If an overflow error occurs, the data in the cut may be corrupted,
and for this reason, the cut is ignored.

When  the  Commsip  cut  module  receives  a  cut,  it  decodes  the  cut’s  payload  to 
extract  the  full  Ethernet  packet  contained  within  and  checks  to  verify  that  the  packet 
should  be  processed.  One  check  compares  the  packet’s  collection  time  to  the 
evaluation  time  window  specified  in  the  user-set  configuration  file.  Another  check 
ensures  the  packet  was  collected  from  a  known  source  and  on  a  tap  being 
considered  in  the  data  reduction.  Only  packets  that  pass  all  checks  are  considered 
for  the  remainder  of  the  reduction  process. 
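
The checks described above can be sketched as a single predicate. This is a minimal illustration only: the `Observation` fields, the inclusive time window, and the `known_devices` and `active_taps` sets are hypothetical stand-ins for values that would come from the cut metadata and the user-set configuration file.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    collected_at: datetime  # collection time of the packet
    device_id: str          # hypothetical: ID of the recording device
    tap: str                # hypothetical: collection point designation


def should_process(obs, window_start, window_end, known_devices, active_taps):
    """Return True only if the observation passes every check."""
    in_window = window_start <= obs.collected_at <= window_end  # assumed inclusive
    known_source = obs.device_id in known_devices
    tap_selected = obs.tap in active_taps
    return in_window and known_source and tap_selected
```

A packet failing any one of the three conditions is excluded from the remainder of the reduction.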

From here, each verified Ethernet packet has its EtherType decoded. All packets
with EtherTypes that are not IPv4 get dropped because of prior knowledge that the
evaluations will only be performed on IPv4 data. Tunneled packets are then broken
down into their inner and outer IP layers by decoding any tunnel protocols, such as
the Generic Routing Encapsulation protocol, that may be encapsulating the packet.

Packets whose outer IP layer has a time to live (TTL) of less than 2 get dropped
because of where on the network the ADMAS records traffic. A packet’s TTL will
typically drop by 2 or more when traveling over the air. Thus, such a packet may
get recorded by the ADMAS but would be dropped before reaching the
destination device.
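
The EtherType and TTL filters can be illustrated with a short sketch. It assumes an untagged Ethernet II frame (no VLAN tag) and skips tunnel decoding entirely, so the IPv4 header it inspects is simply the one that follows the 14-byte Ethernet header; `keep_frame` is a hypothetical helper name.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # EtherType value assigned to IPv4
ETH_HEADER_LEN = 14      # destination MAC + source MAC + EtherType
MIN_TTL = 2              # packets with a TTL below this are dropped


def keep_frame(frame: bytes) -> bool:
    """Return True if the frame carries IPv4 and its TTL is at least MIN_TTL."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return False  # too short to hold a minimal IPv4 header
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    ttl = frame[ETH_HEADER_LEN + 8]  # TTL is the ninth byte of the IPv4 header
    return ttl >= MIN_TTL
```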

6.  Collection  Point 


The  collection  point  is  the  location  where  the  ADMAS  is  observing  traffic  on  the 
network.  There  can  be  many  collection  points  on  one  node;  each  is  given  a  letter 
designation. For example, the collection point “X” is located on the switch port
analyzer  (SPAN)  port  of  the  router  facing  the  over-the-air  (OTA)  radio.  Thus,  any 
traffic  coming  in  or  out  of  the  OTA  radio  will  end  up  being  copied  to  the  ADMAS. 

The location of the collection point is vital to the analysis. For instance, if one side
of the network collects data behind an encrypter and the collector on the other side
is before the encrypter, the packets cannot be matched.

7.  Direction 


The  location  of  the  collection  point  is  an  important  part  of  creating  heuristics  to 
determine  the  direction  of  packet  transmission.  Each  collection  point  may  use  a 
different  heuristic,  and  the  heuristic  may  change  from  event  to  event.  Some  of  the 
simple  cases  rely  on  prior  knowledge  of  the  testing  setup  to  determine  the  direction. 
More  complicated  cases  involve  an  extra  processing  phase  to  gather  more 
information  to  make  the  determination. 

For example, the heuristic for collection point “B”, as used in most of the Warfighter
Information Network-Tactical test events, is a simple check on the media access
control (MAC) addresses in a packet’s Ethernet header to determine if the packet is
inbound or outbound. The heuristic states that if the packet’s source MAC address
is an inline network encrypter MAC address, the packet is considered inbound.
Alternatively, if the destination MAC address is either multicast or the network
encrypter, then the packet is outbound.
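
A sketch of the collection point “B” heuristic follows. The function names and the `encrypter_macs` set are hypothetical; the multicast test relies on the group bit of a MAC address (the least significant bit of the first octet), which is also set for broadcast addresses. Returning `None` for packets that match neither rule is an assumption of this sketch.

```python
def is_group_mac(mac: bytes) -> bool:
    """True for multicast (and broadcast) MAC addresses."""
    return bool(mac[0] & 0x01)


def direction_at_b(src_mac: bytes, dst_mac: bytes, encrypter_macs):
    """Classify a packet observed at collection point B."""
    if src_mac in encrypter_macs:
        return "inbound"
    if is_group_mac(dst_mac) or dst_mac in encrypter_macs:
        return "outbound"
    return None  # direction could not be determined
```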

Some  of  the  heuristics  will  use  Virtual  Local  Area  Network  identifiers  and  MAC 
addresses  to  determine  direction.  There  are  many  other  heuristics  that  change  from 
event  to  event,  but  they  will  not  be  covered  in  this  report. 

8.  Detecting  Local  Traffic 

Local  packets  are  those  that  transit  from  one  device  to  another  on  the  same  network 
node.  An  example  of  local  network  traffic  would  be  a  vehicular  router  pinging  a 
collocated  radio  device  to  see  if  it  responds. 

A  packet  can  be  either  transmitting  between  nodes  or  transmitting  entirely  within  a 
single  node.  Since  the  Commsip  Data  Model  strictly  contains  packets  that  transmit 
between  nodes,  local  traffic  must  be  filtered  out.  Packets  that  have  a  source  and 
destination  on  the  same  node  or  a  packet  direction  that  cannot  be  determined  are 
assumed  local  and  removed  from  processing.  This  check  is  sometimes  used  to 
determine  the  packet  direction;  however,  it  is  generally  used  during  the  Crunch 
stage  of  the  reduction. 
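
The local-traffic rule can be sketched as follows, assuming a hypothetical `node_of_ip` lookup that maps an IP address to the network node hosting it. Treating addresses that are missing from the lookup as non-local is an assumption of this sketch.

```python
def is_local(src_ip, dst_ip, direction, node_of_ip):
    """Return True if the packet should be treated as local traffic."""
    if direction is None:
        # A direction that cannot be determined is assumed to be local.
        return True
    src_node = node_of_ip.get(src_ip)
    dst_node = node_of_ip.get(dst_ip)
    # Source and destination on the same node means the packet never left it.
    return src_node is not None and src_node == dst_node
```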

9.  Hash


To conduct efficient packet matching and calculate the latency, there needs to be a
common key between packets recorded on different devices. To generate this
common key, a packet’s mutable fields are removed since they can change between
the sending side and the receiving side. Next, a hashing algorithm is applied to the
modified packet, resulting in a common key. Mutable fields include the IP options,
the TTL, the packet checksum, and the type of service. In addition, the outer IP
layer (if a tier-2 tunneling protocol is used) may be completely different because of
how the packet gets routed through the network, hopping between tunnel endpoints.
By omitting these mutable fields from the packet and hashing only the inner IP
layer, we find that the sending side hash matches the receiving side hash.
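
A minimal sketch of such a key is given here. MD5 is used because the Appendix describes the matching fingerprint as an md5sum hash; everything else is an assumption of the sketch: the mutable fields are zeroed rather than physically removed, IP options are dropped, and the header length and total length fields are normalized so that a change in options does not alter the key. The name `fingerprint` is hypothetical.

```python
import hashlib


def fingerprint(ip_packet: bytes) -> str:
    """MD5 over an inner IPv4 packet with its mutable header fields neutralized."""
    ihl = (ip_packet[0] & 0x0F) * 4      # header length in bytes, options included
    payload = ip_packet[ihl:]
    header = bytearray(ip_packet[:20])   # fixed header only, so options are dropped
    header[0] = 0x45                     # normalize header length after dropping options
    header[1] = 0                        # type of service
    header[2:4] = (20 + len(payload)).to_bytes(2, "big")  # length without options
    header[8] = 0                        # TTL
    header[10:12] = b"\x00\x00"          # header checksum
    return hashlib.md5(bytes(header) + payload).hexdigest()
```

Two observations of the same packet that differ only in TTL, type of service, checksum, or options then produce the same key, while any change to the addresses, the IP identifier, or the payload produces a different one.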

Figure 1 depicts in red the fields that are not included in the hash because these
fields are mutable. Each hash is associated with the data for its packet stored in the
Packet-Knowledge-Temporary-Store  (PKTS)  local  to  the  Commsip  Worker 
process.  The  PKTS  is  a  temporary  storehouse  for  all  reduced  packet  data.  PKTS 
records  are  incrementally  populated  during  the  various  phases  of  the  reduction 
processing.  Each  process  records  data  into  a  separate  store.  The  data  columns  of  the 
store  are  shown  in  Tables  A-3  and  A-4  in  the  Appendix. 

Fig. 1  IP header hashed fields (bit positions 0 to 32; legend: immutable values are
included in the hash, mutable values are not)


10.  Fragments 

Before the PKTS data are recorded, all IP packet fragments are reconstructed. This
is done because fragmentation can occur anywhere along the network path and thus
may change how the packet appears on the receiving side, making it difficult to
match individual fragments using the hash-based method. In general, fragments
appear in order and relatively close to each other in the file. A fragment
bookkeeping mechanism is used to collect the fragments.

As  the  bookkeeper  collects  packet  fragments  from  a  file,  it  attempts  to  reconstruct 
the  whole  packets  they  originated  from.  Fragments  from  a  file  that  do  not  fully 
recreate a packet are provided to the Receiver process to be matched to fragments
from  other  files.  Packets  that  are  reconstructed  are  decoded  and  have  their 
information  added  to  the  PKTS. 
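
A fragment bookkeeper of this kind might look like the following sketch. The class name, the choice of key (for example source address, destination address, protocol, and IP identifier), and the use of byte offsets are assumptions; overlapping fragments are not handled.

```python
from collections import defaultdict


class FragmentBookkeeper:
    """Collects fragments per packet key and reports completed payloads."""

    def __init__(self):
        # key -> {byte offset: (payload, more_fragments flag)}
        self._pending = defaultdict(dict)

    def add(self, key, offset, more_fragments, payload):
        """Store one fragment; return the whole payload once complete, else None."""
        frags = self._pending[key]
        frags[offset] = (payload, more_fragments)
        expected = 0
        for off in sorted(frags):
            if off != expected:
                return None  # a gap remains before this fragment
            data, more = frags[off]
            expected = off + len(data)
            if not more:
                # Last fragment reached with no gaps: reassemble and forget.
                whole = b"".join(frags[o][0] for o in sorted(frags) if o <= off)
                del self._pending[key]
                return whole
        return None  # the final fragment has not been seen yet

    def leftovers(self):
        """Fragments that never completed, to be matched against other files."""
        return dict(self._pending)
```

In this sketch, whatever `leftovers()` returns at the end of a file corresponds to the fragments that would be handed to the Receiver process.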

11.  Commsip  Crunch 

Once  the  Process  stage  is  complete  and  all  fragmented  packets  have  been  decoded, 
the  Crunch  stage  begins.  At  the  start  of  the  Commsip  Crunch  stage,  the  Receiver 
process  begins  to  offload  work  to  Commsip  workers.  Dividing  up  the  work  across 
the processes is crucial to efficient processing of the data. The initial breakdown of
the work is based on bins into which each packet is placed. The number of bins
depends on the number of processes (P) in the HPC job: the number of bin bits, N,
is determined by Eq. 1,

    $N = \lceil \log_2 P \rceil$ ,  (1)

and the bins cover the range $0 \rightarrow 2^N - 1$. As an example, if we had
250 HPC processes for a reduction job, then $N = \lceil \log_2 250 \rceil = 8$, and the
bins would be $0 \rightarrow 255$.

Based  on  values  calculated  in  the  Commsip  Process  stage,  bin  keys  are  sent  to  each 
Commsip worker. Bin keys are unique bit strings that map to the trailing N bits in
packet  hashes.  The  keys  identify  which  set  of  packets  each  worker  should  operate 
on. 
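
The bin arithmetic can be sketched in a few lines. The helper names are hypothetical, and interpreting the "trailing N bits" as the least significant bits of the hash read as a big-endian integer is an assumption.

```python
def bin_bits(num_processes: int) -> int:
    """N = ceiling(log2(P)), computed with integer arithmetic to avoid rounding."""
    return (num_processes - 1).bit_length()


def bin_key(packet_hash: bytes, n_bits: int) -> int:
    """Bin number taken from the trailing n_bits of the packet hash."""
    return int.from_bytes(packet_hash, "big") & ((1 << n_bits) - 1)
```

For 250 processes `bin_bits` gives 8, so every hash falls into one of the bins 0 through 255.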

Memory must be considered for these packet operations since some of the HPC
machines do not have swap space. When too much runtime memory is consumed, the
node will end the reduction job prematurely. To prevent this, the number of packets
per hash bin (and thus the number of packets that each worker processes at a time)
is limited based on the available memory. For current systems, the upper limit is
set to 200,000 packets. Though this limit may seem low, it allows larger hash bins
to be split up and processed in parallel sub-bins.

During  the  Process  stage,  Commsip  records  a  count  of  packets  per  bin  (PPB),  which 
allows  it  to  determine  when  the  limit  is  exceeded  by  any  bin.  When  the  upper  limit 
is exceeded, Crunch creates a number of sub-bins (S) according to Eq. 2:

    S = [expression not recoverable from the source]  (2)

This will normally produce a set of evenly filled sub-bins, each being
smaller than the upper bound. The Commsip Crunch process will offload each of
these to a worker and then wait until all the processing is done.

12.  Offloaded  Worker 


An offloaded worker is one that receives a bin key, the number of sub-bin bits used,
and a sub-bin value. These workers collect all packet data from the combined set
of all PKTS that matches the bin key and the sub-bin value.
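
A worker's selection test might be sketched as below. The function name is hypothetical, and the sketch assumes that the bin key occupies the trailing `n_bits` of the hash and that the sub-bin value is taken from the bits immediately above them.

```python
def belongs_to_worker(packet_hash: bytes, n_bits: int, bin_key: int,
                      sub_bits: int, sub_value: int) -> bool:
    """True if the hash falls in the worker's bin and sub-bin."""
    value = int.from_bytes(packet_hash, "big")
    if value & ((1 << n_bits) - 1) != bin_key:
        return False
    # Sub-bin bits are read just above the bin bits (assumed layout).
    sub = (value >> n_bits) & ((1 << sub_bits) - 1)
    return sub == sub_value
```

With `sub_bits` set to 0 the sub-bin test always passes for `sub_value` 0, which corresponds to a bin that was never split.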

13.  Packet  Matching 

Once all of the hashes have been found, the offloaded worker must attempt to find
sent and received packet matches. To simplify this, while pulling the sub-bin into
memory, each packet’s data are placed in an indexed table with the packet hash as
the key. Each index (hash) can have one or more packet records, so iterating
through the table provides a collection of packets with the same hash identifier.

Indexes with only one packet represent an unmatched packet. For these, the
packet’s direction flag (pkt_isoutbound in PKTS) is examined to determine whether
this is a received but not sent (RNS; cip_rns = true) or a not completed
(cip_comp = false) packet.

When more than one packet has the same hash value, then the set of packets is split
into two lists (SENT and RECEIVED) based on each packet’s direction flag. These
lists are then fed into 1 of 2 packet-matching processing algorithms: unicast or
multicast.
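
The indexed table and the split into SENT and RECEIVED lists can be sketched as follows. Records are represented as plain dictionaries with hypothetical `hash` and `outbound` keys, the latter standing in for the direction flag.

```python
from collections import defaultdict


def group_by_hash(records):
    """Build the indexed table: packet hash -> list of packet records."""
    table = defaultdict(list)
    for rec in records:
        table[rec["hash"]].append(rec)
    return table


def split_by_direction(packets):
    """Split one hash group into (sent, received) lists."""
    sent = [p for p in packets if p["outbound"]]
    received = [p for p in packets if not p["outbound"]]
    return sent, received


def classify_single(packet):
    """Label a packet that is alone in its hash group."""
    return "not_completed" if packet["outbound"] else "received_not_sent"
```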

14.  Unicast 


The unicast algorithm is outlined in Fig. 2. Unicast matching begins with the 2
packet lists: one for sent packets and one for received packets.

Fig. 2  Unicast packet matching process

The  red  loop  depicts  what  happens  if  there  are  no  received  packets.  Each  sent  packet 
is  made  into  a  one-sided  pair.  The  pair  is  then  appended  to  a  list  of  matched  entries. 

If  instead  there  are  received  packets,  the  list  of  receive-side  packets  is  sorted  in 
ascending  order  by  collection  time.  Initially,  all  packets  in  this  list  are  considered 
to  be  “unmatchable”.  Then,  as  the  algorithm  finds  potential  matches  of  sent  and 
received  packets,  the  received  packets  are  removed  from  the  “unmatchable”  set. 
Figure  3  depicts  how  matches  are  created  from  these  3  lists. 


Fig. 3  Creating matched packet entries (inputs: date-ordered list of received
packets, list of sent packets, and the unmatchable receive set)

For  each  of  the  sent  packets  in  the  list,  a  binary  search  is  used  to  find  the  received 
packet  that  was  collected  the  soonest  after  and  the  one  that  was  collected  the  latest 
before  the  sent  packet.  Any  packets  found  are  considered  potential  matches  and  are 
removed  from  the  “unmatchable”  set. 

The  results  of  the  binary  search  can  produce  3  different  scenarios  as  depicted  by 
the  first  diamond  shape  in  Fig.  3. 

1)  There are no matchable receivers to the left (earlier in time) side of the
binary search result.

2)  There are no matchable receivers to the right (later in time) side of the
binary search result.

3)  There are results on both sides.

In case 1, the algorithm first checks if the right side had a result. If so, a matching
pair is made from the sent packet and the right result. If not, the sent packet must
be tested to determine if it is local traffic. If the sent packet is considered local, then
it is ignored, and the next sent packet is processed. Otherwise, a receiver-less pair
(cip_comp = false) is created and appended to the list of matched entries.

In case 2, there is a left result but no right result. A pair is made using the sent
packet and the left result. The match is added to the list of matched entries.

In case 3, there is a result on the left side and a result on the right side. The algorithm
decides which side has the absolute minimum time difference and generates a
matched pair with the sent packet and the closer received packet.

Once the algorithm has attempted to match all of the sent packets, there may be
some received packets that remain in the “unmatchable” set. The remaining set of
unmatchable received packets is converted into receive-side-only pairs, as shown
in the tan loop in Fig. 4. The resulting received but not sent (cip_rns = true) entries
are added to the list of matched entries.
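
The unicast procedure described above can be condensed into the following sketch, which uses the standard `bisect` module for the binary search. Observations are modeled as `(collection_time, packet_id)` tuples and `is_local` is a hypothetical callback; both are assumptions. The sketch follows the text literally, including the removal of both neighbors from the unmatchable set even though only one of them is paired; ties in case 3 go to the later packet, which is an arbitrary choice.

```python
from bisect import bisect_left


def match_unicast(sent, received, is_local=lambda pkt: False):
    """Return (sent_or_None, received_or_None) pairs for one packet hash."""
    if not received:
        # No receivers at all: every sent packet becomes a one-sided pair.
        return [(s, None) for s in sent]

    received = sorted(received)              # ascending by collection time
    times = [r[0] for r in received]
    unmatchable = set(range(len(received)))  # indices never seen as candidates
    matched = []

    for s in sent:
        i = bisect_left(times, s[0])
        left = i - 1 if i > 0 else None            # latest before the sent packet
        right = i if i < len(received) else None   # soonest at or after it
        unmatchable.discard(left)                  # both are potential matches
        unmatchable.discard(right)

        if left is None:                           # case 1
            if right is not None:
                matched.append((s, received[right]))
            elif not is_local(s):
                # Mirrors the described fallback; with a non-empty receive
                # list this simplified sketch does not reach it.
                matched.append((s, None))
        elif right is None:                        # case 2
            matched.append((s, received[left]))
        else:                                      # case 3: closer one wins
            before = s[0] - times[left]
            after = times[right] - s[0]
            matched.append((s, received[left if before < after else right]))

    # Receives that were never candidates become received-but-not-sent pairs.
    for idx in sorted(unmatchable):
        if not is_local(received[idx]):
            matched.append((None, received[idx]))
    return matched
```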


Fig. 4  Handling unmatched packets (for each unmatchable receive that is not local
traffic, a pair (None, Recv) is made and added to the matched entries)


15.  Multicast 


Dealing with a packet hash set that is multicast is similar to unicast, but the
algorithm must take into account that one sent packet can have multiple receivers.
In  the  unicast  case,  a  packet  could  only  be  matched  once.  In  the  multicast  case, 
however,  it  may  be  matched  once  for  each  device  observing  a  received  copy.  The 
matching  algorithm  is  modified  such  that  the  input  list  of  date-ordered  received 
packets  is  grouped  per  device.  Then  the  list  of  sent  packets  is  traversed  in  the  same 
manner  as  in  the  unicast  algorithm  but  for  each  device  found  in  the  receive  list.  In 
addition,  the  set  of  unmatchable  receives  is  split  up  by  the  device.  Aside  from  this 
difference,  the  algorithm  works  in  the  same  way. 
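
The per-device grouping can be sketched as a thin wrapper around a single-device matcher. Here `match_one_device` is a hypothetical callable taking `(sent, received)` lists, for example a unicast matcher, and records are dictionaries with a hypothetical `device` key.

```python
from collections import defaultdict


def match_multicast(sent, received, match_one_device):
    """Run the single-device matcher once per receiving device."""
    by_device = defaultdict(list)
    for rec in received:
        by_device[rec["device"]].append(rec)

    if not by_device:
        # No receivers on any device: let the matcher build one-sided pairs.
        return match_one_device(sent, [])

    matches = []
    for device in sorted(by_device):
        # Each device gets its own receive list and its own unmatchable set.
        matches.extend(match_one_device(sent, by_device[device]))
    return matches
```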

16.  Offloaded  Worker  (Continuation) 

Once all of the matches have been found, they are filtered for duplicate
observations. Duplicate observations are defined as observations that occur within
0.000245 s of a bitwise identical packet on the same device. After the duplicate
observations are removed, the matched packets are then merged into the previously
described Commsip table format. The latency calculation is performed during the
merge of the 2 packets. In addition, specific flags will also get set, such as
cip_ismulticast and cip_isduplicatepkt. The merged data are then written to disk.
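
The duplicate filter and the latency calculation can be sketched as follows. Observations are modeled as `(device, packet_bytes, time_s)` tuples, which is an assumption, as are the choices to compare each observation against the last one kept and to treat the window as exclusive.

```python
DUPLICATE_WINDOW_S = 0.000245  # window stated in the text, in seconds


def drop_duplicates(observations):
    """Keep the first of any run of identical packets seen on one device."""
    last_kept = {}
    kept = []
    for obs in sorted(observations, key=lambda o: o[2]):
        device, data, t = obs
        prev = last_kept.get((device, data))
        if prev is not None and t - prev < DUPLICATE_WINDOW_S:
            continue  # duplicate observation of the same bytes on this device
        last_kept[(device, data)] = t
        kept.append(obs)
    return kept


def latency(x_time, r_time):
    """Receive time minus transmit time, or None for a one-sided row."""
    if x_time is None or r_time is None:
        return None
    return r_time - x_time
```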

17.  Analyst  Usage


The packet-level data contained in the Commsip table are used by the analytical
community to render many different types of data products. These include
aggregate  statistics  binned  by  time  and/or  location  within  the  network.  The  types 
of  tactical  applications  or  network  devices  can  be  derived  using  the  IP  addresses 
contained  in  each  Commsip  record.  One  sample  data  product  derived  from  the  data 
model is shown in Fig. 5, a Google Earth Keyhole Markup Language (KML) file.
This  product  includes  aggregate  network  statistics  between  node  pairs.  The  white 
lines  represent  the  range  between  nodes  and  are  used  to  render  the  terrain  profile 
between  nodes  (seen  at  the  bottom  of  the  image).  The  blue  arcs  represent  satellite 
communication  links  between  node  pairs;  green  arcs  represent  terrestrial  radio 
links.  The  Commsip-derived  data  for  each  link  can  be  displayed  by  clicking  on  the 
links  (white  box  pop-up  in  upper  left  of  the  map  area). 


Fig.  5  Sample  Commsip-derived  data  product  (KML  file) 

18.  Conclusion 


The  packet-level  analysis  data  processing  module,  Commsip,  has  been  used  for 
multiple  testing  events.  As  the  tests  evolved,  the  module  has  evolved  as  well  to 
cover  new  cases  and  new  collection  points. 

The  Commsip  processing  represents  the  bulk  of  the  computational  work  required 
to  perform  packet-level  network  analysis  and  requires  significant  amounts  of 
processing time to complete. However, with the parallel nature of the data reduction
framework and HPC machines, the impact on time is mitigated to a reasonable
level. Shortening the reduction timeline allows the analysts to receive this
useful data product much faster. Since most of the analysis comes from the
Commsip data product, having it in hand early allows them to determine the
results of the test much sooner.

Appendix.  Tabular  Data  Definitions

Table A-1  Commsip data elements

Column Name               Description
cip_xdate                 Datetime on transmitting side of when packet was observed
cip_rdate                 Datetime on receiving side of when packet was observed
cip_xcollpt               Data collection point on transmitting side
cip_rcollpt               Data collection point on receiving side
cip_comp                  Boolean: True if the packet was observed on the receiving side, False
                          otherwise
cip_rns                   Boolean: True if the packet was observed on the receiving side but not
                          the sending side, False otherwise
cip_totalpacketsize       Total size of the packet in bytes
cip_payloadsize           Size of the packet's innermost payload
cip_latency               Latency between transmission and receipt of packet
cip_xdscp                 Differentiated services code point on transmitting side
cip_rdscp                 Differentiated services code point on receiving side
cip_xdeid                 Device ID on transmitting side
cip_xip                   IP address on transmitting side
cip_rdeid                 Device ID on receiving side
cip_rip                   IP address on receiving side
cip_protocol              The protocol the packet was sent on
cip_payloadhash           The folded md5sum hash of the innermost payload (backwards
                          compatibility)
cip_ipidentifier          The IP identifier on the packet that was sent
cip_fragmented            Boolean: True if the packet was fragmented, False otherwise
cip_xttl                  The time-to-live value on the transmitting side of the packet
cip_rttl                  The time-to-live value on the receiving side of the packet
cip_xsrcmac               The source MAC address on the transmitting side
cip_rsrcmac               The source MAC address on the receiving side
cip_xdstmac               The destination MAC address on the transmitting side
cip_rdstmac               The destination MAC address on the receiving side
cip_txip                  The outermost tunnel IP address of the transmitter
cip_trip                  The outermost tunnel IP address of the receiver
cip_innerfingerprintid    The identifying full md5sum hash of the altered innermost IP layer,
                          used for matching
cip_inferred_x_deid       Intended device the packet was sent from
cip_inferred_r_deid       Intended device the packet was destined for
cip_daglimiteduseid       Used for marking out data that should not be used by analysts
cip_dagreasoncodeid       Used for marking out data that should not be used by analysts
cip_ismulticast           Boolean: True if the packet is a multicast packet, False otherwise
cip_isduplicatepkt        Boolean: True if the packet was observed at more than 2 locations,
                          False otherwise
cip_xvlanid               Virtual Local Area Network ID of the sending side packet
cip_rvlanid               Virtual Local Area Network ID of the receiving side packet
cip_istunneled            Boolean: True if the packet is tunneled, False otherwise
cip_tipid                 The tunnel layer's IP identifier of the packet
cip_t_payloadsize         The tunnel layer's payload size (contains the inner IP layer)
cip_t_payloadhash         The tunnel layer's md5sum payload hash

Table A-2  Commsip additional transport elements

Column Name               Description
txp_id                    13-byte transport identifier string
txp_hash                  Hash of “normalized” transport identifier
txp_xport                 TCP/UDP port of packet transmitter
txp_rport                 TCP/UDP port of packet receiver
txp_istcp                 Boolean: True if packet is TCP, False otherwise
tcp_tcpseq                The sequence number from the TCP header (undefined if not TCP)
txp_tcpack                The acknowledgment number from the TCP header (undefined if not TCP)
txp_datalen               The total size of the TCP/UDP payload
txp_tcpflags              The TCP flags field (undefined if not TCP)
pay_fileid                The internal file ID of the file containing the payload
pay_offset                The file offset location of the payload in the file
pay_length                The length of the payload in bytes



Table A-3  Packet knowledge temporary store structure

Column Name               Description
binkey                    The first n bits of the pkt_fingerprint used to bin the data
nexteight                 The next 8 bits of the pkt_fingerprint used to sub-bin
pkt_date                  The datetime of when the packet was observed
pkt_collpt                Data collection point of the observed packet
pkt_isoutbound            Boolean: True if the packet is outbound, False otherwise
pkt_totalpacketsize       Total size of the packet in bytes
pkt_payloadsize           Size of the packet's innermost payload
pkt_dscp                  Differentiated services code point of packet
pkt_device                Device ID of the observing ADMAS
pkt_xip                   IP address of the transmitter
pkt_rip                   IP address of the receiver
pkt_protocol              The protocol the packet was sent on
pkt_payloadhash           The folded md5sum hash of the innermost payload (backwards
                          compatibility)
pkt_ipid                  IP identifier of the packet
pkt_isfragmented          Boolean: True if packet was fragmented, False otherwise
pkt_packetcount           The number of fragmented packets that the original packet is
                          comprised of
pkt_srcmac                The source MAC address of the packet
pkt_dstmac                The destination MAC address of the packet
pkt_fingerprint           The identifying full md5sum hash of the altered innermost IP layer,
                          used for matching
pkt_ttl                   The time-to-live value of the packet
pkt_vlanid                The virtual local area network ID of the packet
pkt_istunneled            Boolean: True if the packet is tunneled, False otherwise
pkt_tipid                 The IP identifier of the outermost tunnel layer
pkt_txip                  The sending IP address of the outermost tunnel layer
pkt_trip                  The receiving IP address of the outermost tunnel layer
pkt_t_payloadsize         The size of the outermost tunnel layer's payload (includes the inner IP
                          layer)
pkt_t_payloadhash         The folded md5sum hash of the outermost tunnel payload (backwards
                          compatibility)
pkt_tfingerprint          The identifying full md5sum hash of the altered outermost IP layer,
                          used for matching
pkt_inferredxdeid         Intended device the packet was sent from
pkt_inferredrdeid         Intended device the packet was destined for

Table A-4  Packet-Knowledge-Temporary-Store (PKTS) added transport level fields

Column Name               Description
txp_id                    13-byte transport identifier string
txp_hash                  Hash of “normalized” transport identifier
txp_xport                 TCP/UDP port of packet transmitter
txp_rport                 TCP/UDP port of packet receiver
txp_istcp                 Boolean: True if packet is TCP, False otherwise
tcp_tcpseq                The sequence number from the TCP header (undefined if not TCP)
txp_tcpack                The acknowledgment number from the TCP header (undefined if not
                          TCP)
txp_datalen               The total size of the TCP/UDP payload
txp_tcpflags              The TCP flags field (undefined if not TCP)
pay_fileid                The internal file ID of the file containing the payload
pay_offset                The file offset location of the payload in the file
pay_length                The length of the payload in bytes

List of Symbols, Abbreviations, and Acronyms

ADMAS     Advanced Distributed Modular Acquisition System
BLOb      binary large object
FPGA      field-programmable gate array
HPC       high-performance computing
IP        Internet Protocol
KML       Keyhole Markup Language
MAC       media access control
OTA       over-the-air [radio]
PCAP      Packet Capture
PKTS      Packet-Knowledge-Temporary-Store
TTL       time to live